-
- Lexical analysis is the process of converting an input stream of
- characters into a stream of words or tokens. Tokens are groups of
- characters with collective significance. Lexical analysis is the first
- stage of both automatic indexing and query processing. Automatic
- indexing is the process of algorithmically examining information items
- to generate lists of index terms. The lexical analysis phase produces
- candidate index terms that may be further processed, and eventually added
- to indexes. Query processing is the activity of analyzing a query and
- comparing it to indexes to find relevant items. Lexical analysis of a
- query produces tokens that are parsed and turned into an internal
- representation suitable for comparison with indexes.
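
As a rough illustration of turning a character stream into tokens, the sketch below treats a token as a maximal run of letters or digits and folds it to lowercase. That token definition, the name `tokenize`, and the pattern are assumptions made for this example; the choice of what counts as a token is a design decision the text above leaves open.

```python
import re

# Assumption: a token is a maximal run of ASCII letters or digits.
# Punctuation and whitespace only separate tokens and are discarded.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+")


def tokenize(text):
    """Convert a string of characters into a list of lowercase tokens."""
    return [match.group(0).lower() for match in TOKEN_PATTERN.finditer(text)]


print(tokenize("Lexical analysis: the first stage of indexing."))
# ['lexical', 'analysis', 'the', 'first', 'stage', 'of', 'indexing']
```

The same function could be applied to a document during indexing or to a query string during query processing; only the later stages differ.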
-
- In automatic indexing, candidate index terms are often checked to see
- whether they are in a stop list, or negative dictionary. Words on a stop
- list are known to make poor index terms, so they are removed from
- further consideration as soon as they are identified.
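
The stop list check can be sketched as a set membership test applied to each candidate term. The stop list contents and the name `remove_stop_words` are hypothetical, chosen only for illustration; a practical stop list would typically hold far more words.

```python
# Hypothetical, deliberately tiny stop list used only for this example.
STOP_LIST = frozenset({"a", "an", "and", "in", "is", "of", "the", "to"})


def remove_stop_words(tokens, stop_list=STOP_LIST):
    """Return the candidate index terms that are not on the stop list."""
    # Compare in lowercase so that "The" and "the" are treated alike.
    return [token for token in tokens if token.lower() not in stop_list]


candidates = ["the", "analysis", "of", "a", "query"]
print(remove_stop_words(candidates))
# ['analysis', 'query']
```

Storing the stop list in a hashed set keeps each lookup at roughly constant cost, which matters because the check runs once per candidate term.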
-
-